223 research outputs found

    SMS Keynote: Navigating Millions of Chemicals in Metabolomics and Exposomics Workflows

    Keynote presentation for the Swiss Metabolomics Society (SMS) Meeting, 15 September 2023 at ETH Zurich. Thanks to Nicola Zamboni for the invitation! This presentation features a soundtrack by Jamie Perera, "Our Chemical Past, Present and Future", which can be downloaded on Vimeo (video) or Soundcloud (sound only). Please leave feedback there if you enjoy it.

    Open, Dynamic Databases, Workflows and Transformations to Support Environmental Studies

    Invited talk for the Environmental Chemistry and Biogeochemistry Seminar at Umeå University, 17 March 2023, virtual event. Many thanks to Andriy Rebryk for the invitation.

    Finding PFAS: Data Exchange to Support Suspect and Non-target Screening of PFAS

    Invited presentation for the PFAS workshop hosted by BAM on 19 September 2023 in Berlin (hybrid): Advancements of Analytical Techniques for Per- and Polyfluoroalkyl Substances (PFAS) – Second Workshop 2023, https://www.bam.de/Content/EN/Events/2023/2023-09-19-pfas.html. Many thanks to Christian Vogel for the invitation! This presentation features a soundtrack created by Jamie Perera, "Our Chemical Past, Present and Future", which can be downloaded on Vimeo (video) or Soundcloud (sound only). Please leave feedback there if you enjoy it.

    Global Challenges: Opening up Chemistry, Pandemics, and Air Pollution

    As the first half of 2022 comes to a close, it is an interesting time to reflect on some recent trends. In many ways, the world is “opening” up again, with many colleagues going to their first “in person” conferences since the start of the pandemic in early 2020. A significant leap forward for open chemistry was made in 2021, with the Chemical Abstracts Service (CAS) Registry embracing a hybrid model and releasing half a million chemicals as the CAS Common Chemistry set under an open license. (1) ACS Environmental Au continues to develop as one of the key gold open access journals for publishing work on environmental topics. (2) The European Union has just launched the €400 million European Partnership for the Assessment of Risks from Chemicals (PARC), with ~200 partners (3) and a whole work package on FAIR (Findable, Accessible, Interoperable, Reusable) (4,5) and Open (6) data. While these trends are cause for optimism, the CAS Registry continues to climb toward the 200 million chemical mark (7) and many of us were blown away by the sheer immensity of the chemical pollution problem at recent meetings. Other colleagues, e.g., those affected by war, by lockdowns, or with insufficient funds, are unable to share in the “post-pandemic” reopening, conferences, and travel. Others cannot afford the costs associated with open access or still do not see the benefits of open science. Why the focus on these disjoint subjects? Both chemical pollution and the COVID-19 pandemic are global challenges requiring global solutions, where failure to act comes with a high price. Landrigan et al. estimated that 9 million premature deaths (16% of the global total) were caused by pollution in 2015. (8) Worldwide deaths directly due to the COVID-19 pandemic are already over 6 million (9) (January 2020 to May 2022). While public awareness is high, individuals often feel powerless to tackle global challenges, yet the pandemic has proven that individual actions can make an incredible collective difference. The same applies to open data and the exchange of research results: the collective benefit from many individual contributions can be extraordinary.

    Establish data infrastructure to compile and exchange environmental screening data on a European scale

    Robust techniques based on liquid (LC) and gas chromatography (GC) coupled with high-resolution mass spectrometry (HR-MS) enable sensitive screening, identification, and (semi)quantification of thousands of substances in a single sample. Recent progress in computational sciences has enabled archiving and processing of HR-MS ‘big data’ at the routine level. As a result, community-based databases containing thousands of environmental pollutants are rapidly growing and large databases of substances with unique identifiers allowing for inter-comparison at the global scale have become available. A data-archiving infrastructure is proposed, allowing for retrospective screening of HR-MS data, which will help define the ‘chemical universe’ of organic substances and enable prioritisation of toxicants causing adverse environmental effects at the local, river basin, and national and European scale in support of the European water and chemicals management policy

    MetFrag relaunched: incorporating strategies beyond in silico fragmentation

    Background: The in silico fragmenter MetFrag, launched in 2010, was one of the first approaches combining compound database searching and fragmentation prediction for small molecule identification from tandem mass spectrometry data. Since then many new approaches have evolved, as has MetFrag itself. This article details the latest developments to MetFrag and its use in small molecule identification since the original publication. Results: MetFrag has gone through algorithmic and scoring refinements. New features include the retrieval of reference, data source and patent information via ChemSpider and PubChem web services, as well as InChIKey filtering to reduce candidate redundancy due to stereoisomerism. Candidates can be filtered or scored differently based on criteria like occurrence of certain elements and/or substructures prior to fragmentation, or presence in so-called “suspect lists”. Retention time information can now be calculated either within MetFrag with a sufficient amount of user-provided retention times, or incorporated separately as “user-defined scores” to be included in candidate ranking. The changes to MetFrag were evaluated on the original dataset as well as a dataset of 473 merged high resolution tandem mass spectra (HR-MS/MS) and compared with another open source in silico fragmenter, CFM-ID. Using HR-MS/MS information only, MetFrag2.2 and CFM-ID had 30 and 43 Top 1 ranks, respectively, using PubChem as a database. Including reference and retention information in MetFrag2.2 improved this to 420 and 336 Top 1 ranks with ChemSpider and PubChem (89 and 71%), respectively, and even up to 343 Top 1 ranks (PubChem) when combining with CFM-ID. The optimal parameters and weights were verified using three additional datasets of 824 merged HR-MS/MS spectra in total. Further examples are given to demonstrate flexibility of the enhanced features. Conclusions: In many cases additional information is available from the experimental context to add to small molecule identification, which is especially useful where the mass spectrum alone is not sufficient for candidate selection from a large number of candidates. The results achieved with MetFrag2.2 clearly show the benefit of considering this additional information. The new functions greatly enhance the chance of identification success and have been incorporated into a command line interface in a flexible way designed to be integrated into high throughput workflows. Feedback on the command line version of MetFrag2.2, available at http://c-ruttkies.github.io/MetFrag/, is welcome.
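
    The InChIKey filtering mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that candidates sharing the first 14-character block of the InChIKey (the connectivity layer) are stereoisomers of one skeleton, and that only the best-scoring one per block is kept. The function names and the scores are hypothetical and chosen for illustration; this is not MetFrag's actual code or API.

```python
from typing import Dict, List, Tuple

# A candidate is (InChIKey, score); scores here are invented for illustration.
Candidate = Tuple[str, float]


def first_block(inchikey: str) -> str:
    """Return the first InChIKey block, which hashes the connectivity only."""
    return inchikey.split("-", 1)[0]


def dedupe_by_first_block(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the best-scoring candidate per first block, sorted by score (descending)."""
    best: Dict[str, Candidate] = {}
    for key, score in candidates:
        block = first_block(key)
        if block not in best or score > best[block][1]:
            best[block] = (key, score)
    return sorted(best.values(), key=lambda c: c[1], reverse=True)


# Three alanine entries (L-, D- and unspecified stereo) plus glycine.
candidates = [
    ("QNAYBMKLOCPYGJ-REOHZSCUSA-N", 0.82),
    ("QNAYBMKLOCPYGJ-UWTATZPHSA-N", 0.80),
    ("QNAYBMKLOCPYGJ-UHFFFAOYSA-N", 0.79),
    ("DHMQDGOQFOQNFH-UHFFFAOYSA-N", 0.40),
]
filtered = dedupe_by_first_block(candidates)
```

    In this sketch the three alanine entries collapse to a single candidate, so stereoisomers that a mass spectrum alone cannot distinguish no longer crowd the ranked list.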

    Exploring open cheminformatics approaches for categorizing per- and polyfluoroalkyl substances (PFASs)

    Per- and polyfluoroalkyl substances (PFASs) are a large and diverse class of chemicals of great interest due to their wide commercial applicability, as well as increasing public concern regarding their adverse impacts. A common terminology for PFASs was recommended in 2011, including broad categorization and detailed naming for many PFASs with rather simple molecular structures. Recent advancements in chemical analysis have enabled identification of a wide variety of PFASs that are not covered by this common terminology. The resulting inconsistency in categorizing and naming of PFASs is preventing efficient assimilation of reported information. This article explores how a combination of expert knowledge and cheminformatics approaches could help address this challenge in a systematic manner. First, the “splitPFAS” approach was developed to systematically subdivide PFASs (for eventual categorization) following a CnF2n+1–X–R pattern into their various parts, with a particular focus on 4 PFAS categories where X is CO, SO2, CH2 and CH2CH2. Then, the open, ontology-based “ClassyFire” approach was tested for potential applicability to categorizing and naming PFASs using five scenarios of original and simplified structures based on the “splitPFAS” output. This workflow was applied to a set of 770 PFASs from the latest OECD PFAS list. While splitPFAS categorized PFASs as intended, the ClassyFire results were mixed. These results reveal that open cheminformatics approaches have the potential to assist in categorizing PFASs in a consistent manner, while much development is needed for future systematic naming of PFASs. The “splitPFAS” tool and related code are publicly available, and include options to extend this proof-of-concept to encompass further PFASs in the future

    Joint structural annotation of small molecules using liquid chromatography retention order and tandem mass spectrometry data

    Structural annotation of small molecules in biological samples remains a key bottleneck in untargeted metabolomics, despite rapid progress in predictive methods and tools during the past decade. Liquid chromatography–tandem mass spectrometry, one of the most widely used analysis platforms, can detect thousands of molecules in a sample, the vast majority of which remain unidentified even with best-of-class methods. Here we present LC-MS2Struct, a machine learning framework for structural annotation of small-molecule data arising from liquid chromatography–tandem mass spectrometry (LC-MS2) measurements. LC-MS2Struct jointly predicts the annotations for a set of mass spectrometry features in a sample, using a novel structured prediction model trained to optimally combine the output of state-of-the-art MS2 scorers and observed retention orders. We evaluate our method on a dataset covering all publicly available reversed-phase LC-MS2 data in the MassBank reference database, including 4,327 molecules measured using 18 different LC conditions from 16 contributors, greatly expanding the chemical analytical space covered in previous multi-MS2 scorer evaluations. LC-MS2Struct obtains significantly higher annotation accuracy than earlier methods and improves the annotation accuracy of state-of-the-art MS2 scorers by up to 106%. The use of stereochemistry-aware molecular fingerprints improves prediction performance, which highlights limitations in existing approaches and has strong implications for future computational LC-MS2 developments.

    Reaction Data in PubChem

    A presentation at the enviPathPlus workshop (held online) on "Reaction Data in PubChem". Presented by E. Schymanski on behalf of all authors. See slides for details and many hyperlinks. Thanks to Kathrin Fenner for the opportunity.